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Polynomial approximations to the inverse of the fcrmion matrix arc used to filter the dynamics of the upper energy 
scales in HMC simulations. The use of a multiple time-scale integration scheme allows the filtered pseudofermions to be 
evolved using a coarse step size. We introduce a novel generalisation of the nested leapfrog which allows for far greater 
flexibility in the choice of time scales. We observe a reduction in the computational expense of the molecular dynamics 
integration of between 3-5 which improves as the quark mass decreases. 



, ^ I. Introduction 

OO Studying the lattice representation of quantum chro- 

modynamics (QCD) with light quarks remains a numeri- 
,— i.cally intensive challenge, requiring large-scale computing 

resources. The quark fields in the path integral can not be 
»- H .manipulated easily on the computer and the most practi- 
Q^cal means of proceeding is to integrate them analytically 

and deal with the resulting fermion matrix determinant 
*^ , stochastically. After this step, a path integral over the 

gauge fields alone remains and since this is an ordinary 
t— I integration problem, it can be tackled by the Monte Carlo 
^ method. At present, the best means of proceeding is to use 
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an importance sampling technique, with a Markov chain 



X 



01 S au S e field configurations being generated and used for 
subsequent stochastic estimation. The most widely used 
^technique is the Hybrid Monte Carlo (HMC) algorithm 
. Here, a fictitious time co-ordinate is introduced along 
t— I with momentum degrees of freedom conjugate to the gauge 
fields and a Hamiltonian describing evolution in this new 
simulation time. 
^ ■ The difficulty with manipulating quark fields haunts 
this technique, making each application of the Markov pro- 
^3 cess computationally intensive. Over the past ten years, 
'the intensive use of Hybrid Monte Carlo in large-scale pro- 
duction runs means a good deal of practical experience has 
been gained. This experience has exposed the difficulties 
with approaching the physical quark mass and continuum 
limits of lattice QCD. Recent empirical observations Q 
suggest the different topological sectors that contribute 
to the QCD vacuum are connected only weakly via the 
Markov transitions of most commonly used versions of 
HMC. A theoretical study of the approach to the con- 
tinuum limit suggests that HMC is a non-renormalisable 
algorithm 0] , which implies the computational cost of gen- 
erating gauge field configurations with smaller lattice spac- 
ings is unpredictable. 

These observations and unresolved questions strongly 



suggest that developing new ideas and modifications to al- 
gorithms such as HMC remains useful. There has been a 
great deal of activity over the past ten years developing the 
toolkit for numerical simulations, including Hasenbusch's 
mass preconditioning 0, RHMC the use of different 
integrator schemes [y, [7|, |8| and domain decomposition via 
the Schwartz alternating procedure 0, . Some of these 
developments have enabled the first studies close to the 
theory parameters needed to make contact to physical data 
reliably (llj . There is still significant activity aiming to un- 
derstand and improve HMC still further 12|, gL A review 
of recent developments can be found in Ref. 13j. 

When using HMC, there are two obstacles to pushing 
down the quark mass in the simulation. Both stem from 
the fact that the fermion determinant must be represented 
stochastically using pseudofermions with a non-local ac- 
tion that features the inverse of the fermion matrix. The 
first hindrance is that the condition number of the fermion 
matrix increases at lighter quark mass, causing a large in- 
crease in the number of conjugate gradient iterations re- 
quired to solve the linear system. Solving this system is 
required at each molecular dynamics step, which means 
the numerical expense of the inversions is the dominant 
cost for generating dynamical gauge fields. The second 
hindrance, an amplification of the first, is that the molec- 
ular dynamics step-size must be decreased as the quark 
mass is reduced in order to maintain control over the rapid 
fluctuations driven by the pseudofcrmion field. 

Polynomial approximations to the inverse have been 
used to define fermion algorithms for some time fl4l . Il5| . 
Previous explorations in the Schwinger model fl6l ] showed 
that the introduction of a polynomial filter gave a cheap 
means of controlling the short-time-scale fluctuations in 
the molecular dynamics integration of HMC by introduc- 
ing a separation of time scales in the molecular dynamics 
by directly factoring the pseudofermion action into multi- 
ple parts. In this paper, we describe the polynomial filtcr- 
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ing algorithm and present an investigation of its behaviour 
when applied to Monte Carlo studies of QCD. The paper is 
organised as follows: in Sec. [5]the method is described and 
some details of our implementation are discussed. Simu- 
lation results are presented in Sec. [3] and conclusions are 
drawn in Sec. |4] 

2. Polynomial Filtered Hybrid Monte Carlo Algo- 
rithm 

The most powerful method for carrying out importance 
sampling Monte Carlo to estimate integrals over a large 
number of degrees of freedom q, of the form 



(O) = \ J Vq O(q) e 



S(q) 



(1) 



is to develop a Markov process with -^e~ s ^ as its fixed- 
point probability distribution. Here, Z normalises this 
probability measure. The sequence of configurations gen- 
erated by repeated applications of the process can then be 
used as an appropriate ensemble for importance-sampling 
estimation of (O). 

Hybrid Monte Carlo (HMC) is just such a Markov 
chain Monte Carlo technique, comprising of two compo- 
nent parts. New configurations of an extended system, 
(p, q) are proposed to a Metropolis test, and are accepted 
stochastically with probability 



min(l,e- AW ) 



based on the change in the Hamiltonian, 
U(p,q) = \Y,P 2 + S (l)- 



(2) 



(3) 



The extended system doubles the degrees of freedom, add- 
ing a conjugate variable p to each co-ordinate q. Since the 
Metropolis test ensures the probability of a configuration 
of this new system occurring in the Markov chain is given 
by exp{— H} = exp{— T(p)} x exp{— S(q)}, the separa- 
ble product of probability densities of p and q, these two 
sets of variables are independent random numbers. Subse- 
quently, the new variables p can be discarded, leaving an 
ensemble of configurations of q with the desired probability 
distribution. 

To ensure a useful acceptance probability for the Metro- 
polis test, 7i is interpreted as the Hamiltonian describing 
the dynamics in the phase space generated by (p, q) , where 
p is treated as a canonical momentum conjugate to q. In 
practical simulations, it is the molecular dynamics that re- 
quires most of the computer time for the Markov chain to 
evolve. 

This construction strongly suggests that improving the 
efficiency of the algorithm requires both accelerating and 
improving the accuracy of the molecular dynamics integra- 
tion process. As with many complex systems, the classical 
dynamics of the degrees of freedom of lattice QCD incorpo- 
rates interactions with a broad range of characteristic time 



scales. Any attempt to improve the performance of molec- 
ular dynamics will be aided by describing a method that 
allows the separate dynamical scales to be treated differ- 
ently. As a starting point, consider the multiple time-scale 
scheme of Sexton and Weingarten [17j . In this construc- 
tion, an integrator which captures the different dynamical 
scales of different parts of the action can be defined. Inte- 
grators to evolve the system in the new time co-ordinate 
are constructed from two basic time-evolution operators, 
generated by kinetic and potential energy terms. Their 
effect on the system co-ordinates, (p, q) is 



V T (h) : {p, q} — > {p, q + h p}, 



(4) 



and 



V s (h):{p,q}^{p-hdS,q}, (5) 

where dS is the "extended force" due to the action S. The 
simplest time-reversible integrator is then built from the 
leap-frog scheme, 



V{h) = V s {\)V T {h)V s {\), 



(0 



and repeated applications of this building block evolves the 
system with the Hamiltonian conserved up to corrections 
of 0{h 2 ). 

2.1. Multiple time-scales in molecular dynamics integra- 
tors 

If the action and thus the Hamiltonian is split into two 
parts Hi and H 2 , 



Hi 



S 1 (q) + S 2 (q), 
' H 2 



(7) 



where Si is an action capturing the high-frequency modes 
and S 2 captures the low-frequency ones, then a generalised 
leap-frog integrator can be written. The two leap-frog in- 
tegrators for the two Hamiltonians arc defined as 



Vi(h) = V Sl (~)V T (h)V Sl (^), 



(8) 



V 2 (h) = V S2 (h), (9) 

and a compound integrator for the full Hamiltonian can 
be constructed by combining the two components: 



V(h) = V 2 (-) 



m 



v4), 



(10) 



where m £ N. This compound integrator effectively intro- 
duces two time-scales into the evolution, h and h/m. 

Since the force term for Si must be evaluated many 
more times than that for S 2 , the method will only be use- 
ful if two conditions are met simultaneously. First, as al- 
ready suggested, the two actions must effectively split the 
dynamical scales into high and low-frequency parts in Si 
and S2, and also, the evaluation of dSi must be computa- 
tionally much cheaper than that for dS 2 ■ 
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2.2. HMC for Lattice QCD 

When using HMC to generate an ensemble of gauge 
configurations for importance sampling Monte Carlo esti- 
mation of the path integral of lattice QCD, a few details 
arise. The first is that the integration variables are ele- 
ments of the gauge group, SU(3), and the gauge invariant 
Haar measure must be used. This can be achieved by 
making the conjugate momenta elements of the Lie alge- 
bra of the gauge group, and by modifying the action of Vr 
slightly; 

V T (h):(P,U)^(P,e ihP U). (11) 

Correspondingly, the effect of the evolution operator Vs 
must then be 



where 



with 



V s (h) : (P,U) - 
„ A- At 



2i 



{P-hS,U), 



ImTrA, 

N 



(12) 



(13) 



ReTr <^ A 



dU 



(14) 



dS 

— ^„ ^ 

While this defines molecular dynamics on the group 
manifold, the action for QCD with the quark field dynam- 
ics included requires some manipulation. After integrating 
out the Grassmann variables representing the quark fields, 
the importance sampling probability measure for two de- 
generate flavours of quarks becomes 

V [U] = - det 2 M[U] e- s °M = L e -SMW. (15) 

z z 

The effective action for this measure is given by 

S eS [U] = -2Tr \ogM[U] + S G [U]. (16) 

Unfortunately, evaluating this action and the force terms 
that arise from the molecular dynamics is very cumber- 
some on the computer. A solution requires the introduc- 
tions of "pseudofermions" , which represent the fermion de- 
terminant as a Gaussian integral; 



det 2 M[U] = \V(j>V<i>* 



(17) 



and the new effective action for the system is 

S eS [U,4>,ct>*] = <l>*K- 1 4> + S G [U], (18) 

where K = M'M. The 75-Hermiticity property of the 
fermion matrix has been exploited here; since M t = 75M75, 
det Aft = detM and hence det K = det 2 M. The molec- 
ular dynamics force term for the pseudofermions can now 
be computed; this requires the application of the inverse 
of the fermion matrix. While this step is computationally 
very expensive, it is at least tractable. Since the action 
has two components, it seems natural to use the Sexton- 
Wcingartcn scheme to integrate both parts with different 
time-scales appropriate to their dynamics. Unfortunately, 
at light quark masses the high-frequency dynamics occurs 
in the pseudofcrmion action, which also has the most ex- 
pensive force evaluation. 



2. 3. The polynomial filtered hybrid Monte Carlo algorithm 

The requirement that the integrator for the shortest 
time-step has a force term that is computationally cheap 
to implement leads to using low-order polynomial approx- 
imations to the inverse of the fermion matrix. Clearly, a 
short polynomial satisfies the condition that its force is 
cheap to evaluate. The hope is that such a simple approx- 
imation might be able to mimic the short-distance part of 
the propagator, and so capture the high-frequency dynam- 
ics expected in this part of the system. 

Making use of the identity which holds for any polyno- 
mial, V of the (two flavour) fermion matrix 



detif 



det KV(K] 
detV{K) 



(19) 



suggests introducing two auxiliary integrals for each of 
these determinants separately, 



det 2 M= Vcj)V4>*VxDx* e 



-4>*(VK)-~>-4>-x"Px 



(20) 



The action for QCD then becomes 

S mcr [U}^r(PK)- 1 <f> + X *Px + S G [U}. (21) 
This action is separated into two parts, Si and S2, 

S 1 = <ff(VK)- 1 <t>, (22) 



S 2 =X*PX + Sg[U}. 



(23) 



In the limit that a perfect representation of the inverse is 
constructed, i.e. KV(K) — > /, the first action becomes in- 
dependent of the gauge fields and so induces no molecular 
dynamics force. Conversely, as the order of the polyno- 
mial reduces, V — > I and then the system reproduces the 
pseudofcrmion action of Eqn. (|18[) . The polynomial can 
thus be used to capture as much or as little of the dy- 
namics of the fermion action as is necessary to extract the 
high-frequency modes as cheaply as possible. 

2.4- Adding additional time-scales. 

The determinant ratio identity of Eqn. (|19|) can be ex- 
tended to make use of more than one polynomial 



det AT = det KV 2 (K) x 



1 



detV{\K)V 2 {K) det Pi (AT) 

(24) 

The order of the first polynomial, n\ is chosen to be much 
less than that of the second, n-2 ^> n\. For this con- 
struction to be useful, the two polynomials should be con- 
structed so all the roots of the first are contained in the 
set of roots of the second. In this case, V2 can be written 
as the product of two polynomials, 



V 2 =ViQ 



(25) 



where the order of the new polynomial Q is n 9 = n% — n\. 
Now, the fermion determinant can be re-expressed using 



3 



three integral representations, and the resulting action for 
QCD becomes 

S 2 -poly = WiK)- 1 ^ + X *2$QX2 + XlAVlXl + S G . 

(26) 

Judicious choice of the two polynomials, V\ and V<x should 
capture the dynamics of successively increasing time-scales, 
and allow a generalised version of the Sexton- Weingartcn 
scheme to be employed. 

2.5. Polynomial approximation to — 

Naturally, in order to be able to perform Monte Carlo 
simulations V(K) must be a real polynomial and hence 
possess roots in complex conjugate pairs. In this work we 
use a Chebyshev approximation to to l/z, 



Pn(z) 



k=l 



Z k ) « -. 



The roots are given by \\A 

z k = - cos 6^) - 



v^- sm U k , 



with 6 k =&gr. 
K n+1 



The normalisation is then 



(27) 



(28) 



(29) 



The polynomial possesses roots which lie on an ellipse in 
the complex plane. The parameters /i and v should be 
chosen such that the spectrum of K lies within the el- 
lipse. Within this constraint one is free to tune the two 
parameters. For the choice /j, = 1, v = the polynomial 
coincides with that given by the truncation of the Taylor 
series expansion of l/z about z = 1. 

2.6. A generalised multi-scale leap-frog integrator 

For a two flavour simulation with up to two polynomial 
terms in the Hamiltonian, we introduce the possibility of 
simulating at four different time scales, one for each of the 
terms involved, 



Si — Sg, 
S3 = X2QX2, 



5*2 = X*iPiXu 

Si = (f>* 2 { (v 2 K)- V 2f . 



Associate a timcstcp hi to each term Si and a correspond- 
ing integer Ni such that hi = 1/iVj. hi corresponds to the 
step-size at which the gauge fields are updated. Assum- 
ing that Ni > Nj for i < j, the Sexton- Wcingarten nested 

leapfrog algorithm then requires that iVi| JV* 1 Vi > 1. The 

restrictions this places on the various JVj may not be the 
most efficient or flexible way of performing the molecular 
dynamics integration. 

A generalised scheme in which the only require- 
ment is 

N i \N 1 Vi>l, (30) 



can be defined. In a standard leapfrog algorithm, one alter- 
nates between updates Vt to the gauge field U and updates 
Vs to the conjugate momenta. Let Vi denote the update 
to P corresponding to the action Si. Now, as the guide 
bosons are held fixed during an integration the updates Vi 
only depend upon the gauge field. As the updates Vi are 
additive to P, it follows that the different V commute: 

*5(y)YKW(y) - ViihiWihi) = VMVjfa). (31) 



Define the integers 



mi = Ni+Ni 



(32) 



to be the ratios of the scales. In order to construct our 
reversible integrator we first define a map 

f V if m\k 

G[V;m,k e N] = i , ^ (33) 

I /(the identity) otherwise. 

Let tot be the lowest common multiple of {to;}, and let 
hx be the smallest time step (in our case /it = hi). Then 
our integrator is 



v(h) = l[vq)x 

i 

niT — 1 

v T (h T ){l[eiv(h t ) 7 m u k}} 



k=l 



V T (h T )Y[V(f), (34) 



where h = rnrhr is the total timestep taken by V. While it 
appears cumbersome here, the above expression is straight- 
forwardly implemented in software. We demonstrate this 
with a pseudocode implementation here. Denote by {a = b 
mod to} the usual notion of congruence modulo m. Then 
we can implement the generalised integrator as follows: 

• For each term in the action Si perform an initial 
half-step Vi(^hi) updating P. 

• Loop over j = 1 to N — 1 

— Apply Vr(h) to update U. 

— If {0 = j mod mi} apply Vi(hi) to update P 

• Apply Vr(h) to update U. 

• For each term in the action Si perform a final half- 
step Vi(^hi) updating P. 

The advantage of the generalised integrator is that it 
allows finer control when there are many different scales. 
An analysis of the finite-step size errors for the generalised 
integrator is provided in | Appendix A| 
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2. 7. Polynomial force term 

For completeness, we review here the molecular dy- 
namics force generated by the polynomial terms [TBI [20j | . 
Given a polynomial 

n 

V{K) = a n J{{K - Zi ) (35) 

and the following action 

S P = tfV(K)<f>, (36) 
we define the auxiliary fields 

Xj = H(K - Zi)h (37) 

i>j 

Vj = a n ]l(K - z*)cf>, (38) 

Then we have 




Although the above equation appears to require 0{n 2 ) ma- 
trix operations one can reduce this to O(n) by trading off 
for storage of 0(n) intermediate fields [151 ] . 

3. Simulation Results 

Results are calculated on lattices using the Wilson gauge 
action and the unimproved even-odd preconditioned Wil- 
son fermion action. In order provide a straightforward 
comparison with alternative algorithms, we chose physical 
parameters that have been used elsewhere HHEI. The 
gauge coupling is set to (3 = 5.6. We work at two quark 
masses, k — 0.1575 and 0.15825 which correspond to pion 
masses of 600 and 400 McV respectively. 

We begin by examining the effectiveness of a single 
polynomial filter (n q = 0). We measure the size of the force 
Sf due to the filtered pseudofermion term Sf and of the 
force Ep due to the polynomial term Sp. Each clement 
of the force is in the Lie algebra su(3) hence as in other 
work [221] we use the Lie norm \ \X\\ 2 = — 2TrX 2 . Across a 
configuration we measure the mean and maximum values 
of These values do not vary much from one trajec- 

tory to the next. Tuning of the parameters pi and v < /.t 
is done by choosing the values which minimise a sample 
measurement of the force on a single configuration. 

The left-hand plot in Figure Q] shows typical mean and 
maximum forces due to the (filtered) pscudofermions. We 
sec that even using very short polynomials gives significant 
reduction in the force, which continues to decrease as one 
increases the order of the polynomial n p . We note however 
that the rate of the reduction in the force is sub-linear in 
n p . 



k a b c x 2 /dof 

0.1575 0.0150(4) 0.0778(5) 0.581(2) 2.12 
0.15825 0.0271(9) 0.0187(9) 0.586(1) 15.8 

Table 1: Values obtained for the fit parameters for tq = a + 6nJ. 

The right-hand plot in Figure [1] shows typical mean 
and maximum forces due to the polynomial term. We see 
that this is roughly independent of the size of the polyno- 
mial, and is approximately equal to the value for n p = 0. 
This indicates the the size of the force is dominated by the 
shortest scale present, which strongly supports the moti- 
vation for separating the time scales. 

We can also measure the efficacy of the polynomial 
filter by considering the acceptance rate as a function of 
the pseudofermion step size. Figure [21 shows the accep- 
tance rate as a function of hp for a single polynomial filter 
(of various order n p ). The polynomial step size was kept 
approximately fixed. Fits are performed using the com- 
plementary error function 

Pace = erfc(%. (40) 

We call To the characteristic scale. The characteristic scale 
as a function of polynomial order is shown in the right hand 
plot of Figure [21 To gain some intuitive understanding, we 
can see that using a step size of half the characteristic scale 
will yield an acceptance rate of approximately 0.7. We fit 
the characteristic scale as function of ri\ as follows 

T a = a + bn c 1 . (41) 

The addition of a second polynomial filter does not 
change the qualitative behaviour of the quantities we have 
examined. To study the effects of adding a second polyno- 
mial, we fix n p = 4 and vary n q . Figure [3] shows the size of 
the force Y>p due to the pseudofermions and that due to the 
second polynomial term Sq. Even at moderate n q , 
is significantly reduced compared to its unfiltered value, 
by a factor of between 10-20. We see again that |£f| de- 
creases sub-linearly with n q and that |£q| is essentially 
independent of n q , being dominated by the shortest scale 
present. Figure U shows the acceptance rate as a function 
of Hf for a 2-filter system (with n p = 4 and various n q ) 
on the left, and the characteristic scale To as a function of 
TI2 = n p + n q on the right. We again perform a fit for tq 
as before. 

We repeat the comparison of force sizes, acceptance 
rates and characteristic scales for our lighter quark mass 
at k = 0.15825. Results are shown in Figures [5] and |B] for 
a variety of polynomial filters. We see the same quali- 
tative behaviour in these results as those for k = 0.1575 
discussed above. In the right hand plot of Figure [5] fits for 
the characteristic scale for both masses using combined 
data for the single and dual polynomial filter results are 
shown. The values obtained for a, b, c are given in Table [TJ 
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Figure 1: The size of the force due to the pseudofcrmions ||£f|| (left) and that due to the polynomial term ||Sp|| (right), as a function of 
(single) polynomial filter order n p . The maximum and average force size is shown for n = 0.1575. For comparison, the values for with 
n p = are also shown as the leftmost darker coloured bars in the graph for ||£p|| (right). 
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Figure 2: (Left) Trajectory acceptance rate as a function of pseudo-fermion step size for a single polynomial filter of various order n p . (Right) 
Fit of the characteristic scale to as a function of polynomial filter order ri2 = n p . Results are shown for n = 0.1575. 



In order to measure the relative numerical cost of the 
different choices for n p , n q we calculate the total number of 
Dirac matrix applications needed per molecular dynamics 
trajectory at an acceptance rate of 0.7, and we denote this 
value by Do. 7- Inverting Eq . HU1 allows us to determine the 
pseudofermion step size hp that corresponds to p acc = 0.7. 
Keeping the polynomial step sizes approximately fixed at 
Np w 120, Nq ps 30, it is then a simple matter to calculate 
Do. 7- Figure[7]shows Do. 7 a s a function of polynomial order 
n.2- We see that although there is a large initial reduction 
in cost, Do. 7 becomes nearly flat after about n-x = 4. The 
reason for this is that our polynomial of order n,2 reduces 
the condition number of the Dirac matrix by less than a 
factor of (na + 1 ) ~ 1 ■ As each iteration of the conjugate gra- 
dient routine requires n-i + 1 applications of the Dirac ma- 
trix, we can see that the cost of inverting the filtered Dirac 



matrix is increasing. This is offsetting the gain we obtain 
by having to invert less often. A potential improvement 
which could prove particularly useful for larger polynomi- 
als is to make use of a linear multi-shift inverter [23L EiL 12BI] 
to evaluate the action of the inverse of MV . 

However, at large 712 the force contribution from the 
fermions is very small indeed. This allows us to slacken the 
inversion target residual during the molecular dynamics 
integration. Slackening the residual from 10 -7 to 10~ 4 
significantly reduces Do. 7. This is equivalent to working 
with an approximate Hamiltonian during the integration, 
and hence may put an upper bound less than one on the 
acceptance rate. This effect is observed for example if we 
try r = 10~ 3 . 

As an aside, we also performed some tests using a vari- 
ant polynomial filter applied to the Wilson operator M 
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Figure 3: The size of the force due to the pseudofcrmions ||£f|| (left) and that due to the polynomial term ||Sq|| (right), as a function of 
polynomial filter order n q with n p = 4 fixed. The maximum and average force size is shown for k = 0.1575. For comparison, the values for 
||Sp|| with n p = 4, n q = are also shown as the leftmost darker coloured bars in the graph for ||Sq|| (right). 
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Figure 4: (Left) Trajectory acceptance rate as a function of pseudo-fermion step size for a dual polynomial filter of fixed order n p = 4 and 
various order n q . (Right) Fit of the characteristic scale to as a function of polynomial filter order n2 = r, 
n = 0.1575. 



directly rather than K = M^M. However, as one goes to 
light quark masses we find that the spectrum of M (which 
is complex) intrudes upon the boundary of the elliptical 
region defined by the roots of the polynomial, making this 
alternative method unfeasible. 

4. Conclusions 

The use of polynomials to separate different time scales 
in the molecular dynamics integration step of the HMC 
algorithm applied to lattice QCD with dynamical quark 
fields was introduced. This shifts the most expensive force 
terms to the coarsest scale (and vice versa) , allowing mul- 
tiple time scale integration schemes to be effective. The 
procedure is extensible allowing not only the separation 



of UV and IR dynamics, but also the introduction of in- 
termediate scales. A novel generalisation of the leapfrog 
integrator was introduced which allows for far greater flex- 
ibility in choosing the time scale that one associates with 
the dynamics induced by a particular term in the action. 
The integrator is applicable to any simulation that makes 
use of multiple time scales. 

Polynomial approximations to the inverse were shown 
to be successful UV filters for two flavour simulations. One 
possible improvement is to use a multi-shift solver to calcu- 
late the inversion of MV. This should prove most advanta- 
geous when dealing with polynomials of larger order, which 
occur when using an intermediate filter. The use of a poly- 
nomial filter can be applied to single flavour simulations 
using a variety of implementations. A detailed descrip- 
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Figure 5: The size of the force due to the pscudofcrmions ||Sf|| (left) and that due to the polynomial term ||£„ j H || (right), as a function of 
polynomial filter order. Results are shown for various n p , with two results for n q = 20, 30 for which n p = 4. The maximum and average force 
size is shown for n = 0.15825. For comparison, the values for ||Ef|| with n p = 0,n 9 = are also shown as the leftmost darker coloured bars 
in the graph for ||S poly || (right). 




Figure 6: (Left) Trajectory acceptance rate as a function of pseudo-fcrmion step size for a dual polynomial filter of various order n p and n q , 
with k = 0.1575. (Right) Fit of the characteristic scale tq as a function of polynomial filter order ri2 = n p + n q . Combined results are shown 
for both k = 0.1575 and k = 0.15825. 



tion and comparison of these techniques are the subject of 
future work. 
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Appendix A. Integrator Analysis 

We perform an error analysis of our generalised integra- 
tor for a simple choice of stepsizes, following the procedure 
in 17| . Given a Hamiltonian 7i we can write the evolution 



operator for our system as exp/iH, with stepsize h. Here 
we have defined H as the linear operator on the vector 
space of functions / on phase space (p, q) defined by the 
Poisson bracket 



^ \ dp t dqi dq t dp t 



(A.l) 



If we write the Hamiltonian as 

H = T + Si + S 2 + S 3 + S 4 



(A.2) 
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Figure 7: The cost Do. 7 as a function of polynomial filter size ri2, for re = 0.1575 (Left) and re = 0.15825 (Right). 



then for each term in the Hamiltonian we can correspond- 
ingly define a linear operator using the Poisson bracket 
relation above. 

Proceeding with the error analysis, we make use of the 
Bakcr-Campbell-Hausdorff result, 



we obtain 



h O h rp h Q h rp h Q h rp h Q 

e« Sl e3 ea Sl e3 e^ bl e^ e« bl 



exp (hH, + h^—nsunn + n^[[Si,nsi])), 



108 



216' 



e xA e x ^e xA = exp (\(2A + B) 



(A.7) 



and 



-([[A,B],A} + [[A,B},B})+0(\ 4 ]) (A.3) u S hf h§ hf , § 

6 / e 4 * 2 e 2 1 e 2 ° 2 e 2 e** 2 = exp [hH 2 



and apply this to the generalised leapfrog integrator in the 
simple case of TL = T + Si + S 2 , where the time scale for 
each term in TL corresponds to a number of integration 
steps Nt = 6,Ni =3 and N 2 — 2 respectively. The 
integrator for this simplest non-trivial case can be written 
as 



ho ho h rp h ci h rp ho h rp ho h rp ho ho 

1" 2 e« e* e 2 1 e T 



(A.4) 



Repeated application of our BCH result allows us to de- 
duce that the above expression can be written as 



1 



1 



exp hH + ^{-[{S^ T], T] + - [[S 2 , T],S 2 ] 



^48 



96' 



^ [[§! , f ] , 5i] + ^ [[§! , f] , f] + ^ [[S 1} f] , S 2 }) 

(A.5) 

From this expression we can immediately see that the error 
in the generalised integrator relative to the leading term 
is 0(h 2 ), just as for the regular leapfrog. 

If we examine the individual leapfrog integrators cor- 
responding to 



H X = T + S X , H 2 =T + S 2 



(A.6) 



hH^[[ s ^nf} + -[[s 2 ,ns 2 })) (a. 



k 48 



96' 



Hence we see that the only difference between the in- 
dividual integrators and our generalised integrator is the 
cross term [[Si , T] , S^]- The algorithm is identical to a stan- 
dard nested leapfrog in the case where iVj|iV,_i. 
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